OAuth: The Authorization Framework

Introduction#

In this lesson, we'll explore a popular framework that builds on top of the security protocols we mentioned previously and extends them to implement authorization. Such an extension is necessary because the protocols we studied earlier (HTTP authentication, API keys, JWTs) by themselves are not sufficient and have limitations, such as:

  • Using API keys as a standalone authentication mechanism isn't sufficient because they require a solution for their secure transmission.

  • Repeated transmission of tokens for authentication increases the probability of interception by an attacker.

These and other issues require frameworks that combine individual protocols into a complete solution that achieves security goals effectively. One benefit of having a standard authorization mechanism is that security experts can vet complex security issues instead of everyone inventing their own solutions that might have security bugs. One of these frameworks is OAuth 2.0. OAuth 2.0, short for Open Authorization 2.0, is the preferred way of authorizing access to an API. It's a technological standard that allows applications to access resources hosted by other web apps through an API on behalf of a user.

Let's begin by expanding on OAuth 2.0.

What is OAuth 2.0?#

OAuth 2.0 is an industry-standard framework for token-based authorization, published by the IETF as RFC 6749 in 2012. It succeeds OAuth 1.0, which was developed in 2007, but isn't backward compatible with it. Using OAuth, we can allow API clients access to specific information without exposing the user's credentials. This information can be used to create accounts on third-party applications without the need for sharing sensitive user credentials (like passwords).

OAuth does this using access tokens instead of passwords. Access tokens are used primarily because:

  • Authorization is provided for a limited time to specific resources only.
  • User account information is shared without the transmission of authentication information such as passwords to any third party.

As a real-life example, assume that we want to create a new Spotify account using our existing Google account. In that case, Spotify allows us to sign up by clicking the "Sign up with Google" button. Spotify and Google facilitate this functionality using the OAuth 2.0 framework.

Spotify allowing account creation using an existing third-party application:

  1. The user initiates sign-up on Spotify using their Google account.
  2. Spotify requests information from Google.
  3. Google asks for and receives the user's permission to share their information.
  4. Information is provided to Spotify.
  5. The account is created with the information received.

Access to resources requested in the API calls is provided based on the validity of the access tokens.
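To illustrate how an access token accompanies an API call, here is a minimal sketch. It assumes the token is sent as a bearer token in the Authorization header, which is a common convention. The endpoint URL and token value are hypothetical placeholders, and the request is only constructed, never sent.

```python
import urllib.request


def build_api_request(url: str, access_token: str) -> urllib.request.Request:
    """Attach the access token to an API call as a bearer token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


# Hypothetical endpoint and token, for illustration only.
request = build_api_request(
    "https://api.example.com/v1/profile", "example-access-token"
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

The resource server would read this header and decide, based on the token's validity, whether to serve the request.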

Actors/roles#

Four entities in OAuth communicate to achieve secure transmission of resources:

  • End user: This is the application user that owns the resources (like the protected information in their account).

  • Resource server: This server holds the resources of the end user (the Google server holding the user's protected information in the example above).

  • Application/client: The application or client requests the user's resources from the resource server (Spotify application).

  • Authorization server: This server provides authorization to the application/client for accessing the resources (the Google authorization service).

Flows#

As stated before, OAuth uses access tokens to allow access to protected resources. Depending on the use case and actors involved, OAuth facilitates four different kinds of communication patterns to obtain the access token. These patterns are referred to as flows. We describe each of them below:

Types of authorization flows for APIs

Based on the use case, we can match the appropriate flow. Let’s define them briefly:

  • Authorization code: This is the flow depicted in the Spotify example above, in which the user logs in and provides the required permissions. After this, a code is obtained to give us the required permissions. This code is ultimately exchanged for an access token. The access token can then be used while making the API calls.

  • Authorization code with Proof Key for Code Exchange (PKCE): This follows the same flow as the authorization code but adds security measures. It has an extra parameter called code_verifier, which the authorization server checks before it issues the access token. Some well-known applications, such as Facebook, allow us to use the PKCE code flow in their APIs for extra protection.

  • Client credentials: This flow differs from the others because no permissions are needed from the end user. The access token is obtained with the client's own credentials. Applications use this flow to access resources that don't belong to a specific user, such as public playlists in music applications like Spotify.

  • Implicit grant: This is similar to the authorization code flow but has no intermediary code. Instead, we directly get an access token that has a very short lifespan and can't be refreshed. Nowadays, the implicit grant flow isn't recommended because, without the intermediary code, the access token is returned directly in the redirect and can be intercepted.

Authorization code flow#

The authorization code flow is used to distribute access tokens to those applications communicating with a server on behalf of a user. The Spotify example earlier in the lesson follows this flow.

The end user is required to log in and grant the required permissions, after which an authorization code is generated. We can use this code to get an access token, which can then be used to request data. This flow drives the communication between all four actors. Let's take a look at the slides below to understand the communication flow before describing it in detail:

  1. The application/client redirects the user to the authorization server.
  2. The authorization server asks the user to grant permissions.
  3. The user grants the permissions.
  4. The authorization server redirects the user to the redirect URI, which delivers the authorization code to the application.
  5. The access token is requested by the application.
  6. The access token is granted.

In the flow above, the authorization server acts as an intermediary between the end user and the application. The authorization server prompts the user for permission, through which the user logs in and grants the required permissions. The process is as follows:

  1. The application redirects the user to the authorization server. In this redirect, the application sends its own details (such as its client ID, the redirect URI, and the scope of its request, amongst others) in the form of an authorization request. The request typically carries the parameters response_type, client_id, redirect_uri, scope, and state.

  2. After validating the client's identity, the authorization server prompts the user to authorize the request made by the application.

  3. The user approves or denies the request.

  4. After acquiring permissions, the authorization server redirects the user to the redirect URI. This redirect carries the authorization code (whose sole purpose is to be traded for an access token) as the code query parameter.

  5. The application then uses this authorization code to request an access token from the authorization server. It's important to note that the authorization server only authorizes the application, whereas the information it requests is stored on the resource server.

  6. The authorization server sends the access token to the requesting application.
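The first and fourth steps of the process above can be sketched as follows. This is a minimal illustration rather than any provider's actual API: the endpoint, client ID, redirect URI, and scope values are hypothetical, and the state check is a commonly recommended safeguard that the lesson doesn't cover in detail.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values; a real provider documents its own endpoint and scopes.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(state: str, scope: str = "profile email") -> str:
    """Build the URL the application redirects the user's browser to."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # random value echoed back to detect forged callbacks
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def extract_authorization_code(callback_url: str, expected_state: str) -> str:
    """Read the authorization code from the redirect URI's query string."""
    query = parse_qs(urlparse(callback_url).query)
    if "error" in query:
        raise PermissionError(f"authorization failed: {query['error'][0]}")
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; ignoring callback")
    return query["code"][0]


state = secrets.token_urlsafe(16)
print(build_authorization_url(state))

# Simulated redirect back from the authorization server.
callback = f"{REDIRECT_URI}?code=example-auth-code&state={state}"
print(extract_authorization_code(callback, state))
```

If the user denies the request, the redirect carries an error parameter instead of a code, which is why the sketch checks for it first.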

Note: In the example of Spotify given above, the authorization and resource servers belong to Google, whereas the application is Spotify.

The application can now forward this access token to the resource server to request the user's resources. After verifying the access token (for example, by checking its signature), the resource server grants the application's request and returns the required information.
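The exchange of the authorization code for an access token can be sketched as below. The token endpoint, credentials, and sample response are hypothetical, and the request is only built, not sent. Sending the client secret in the form body is one common option and an assumption here; some providers expect HTTP Basic authentication instead.

```python
import json
import urllib.request
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://auth.example.com/token"  # hypothetical endpoint


def build_token_request(
    code: str, client_id: str, client_secret: str, redirect_uri: str
) -> urllib.request.Request:
    """Build the POST request that trades an authorization code for a token."""
    body = urlencode(
        {
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": redirect_uri,  # must match the earlier request
            "client_id": client_id,
            "client_secret": client_secret,  # proves the client's identity
        }
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_token_response(raw_body: bytes) -> str:
    """Pull the access token out of the JSON token response."""
    return json.loads(raw_body)["access_token"]


token_request = build_token_request(
    "example-auth-code",
    "example-client-id",
    "example-client-secret",
    "https://app.example.com/callback",
)
print(token_request.get_method(), token_request.full_url)

# Hypothetical response body, shaped like a typical token response.
sample_response = (
    b'{"access_token": "example-access-token", '
    b'"token_type": "Bearer", "expires_in": 3600}'
)
print(parse_token_response(sample_response))
```

Because this call goes directly from the application to the authorization server, the access token never passes through the user's browser.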

Question

Why don't we send the client the access token in the first step instead of having it exchange the authorization code for the token?

Answer

Directly providing the client with the access token is considered insecure, so exchanging the authorization code for the access token is a step added for security. If the access token were sent instead of the authorization code, it would be returned as a URL parameter, which is stored in the history of the user's browser and can leak from there. The authorization code also travels in the URL, but on its own it's of little use: the second call, made directly from the application to the authorization server, lets the server check that the client is who it claims to be before issuing the access token.

Authorization code with PKCE#

The standard authorization code flow assumes that the client can securely store a long-term secret and that the authorization code can't be intercepted on its way back to the client. Neither assumption holds for public applications, and the PKCE extension addresses these shortcomings. PKCE follows the same flow as the authorization code but is considered more secure.

Public applications, such as native mobile applications and single-page applications (SPAs), can't be trusted to keep a client secret confidential. So, further security measures are taken to prevent authorization code interception, through which attackers can obtain access tokens and impersonate the legitimate user.

Let's suppose we have a mobile application that is requesting data from an API. It's imperative for the mobile application to use the PKCE extension because an attacker app has the capability to impersonate it:

The attacker intercepting the authorization code

In the diagram above, the attacker can exchange the authorization code for the access token and impersonate the legitimate user. This is mitigated with the use of PKCE.

This is an additional layer of security on top of the previous OAuth flow. The standard authorization code flow works for confidential clients, which are able to store a client secret securely (public applications aren't), but more and more APIs are adding PKCE due to its additional security. The differences between the two flows appear in steps 1, 5, and 6 of the slides below:

  1. A code challenge is sent along with the authorization request and is stored on the authorization server.
  2. The authorization server asks the user to grant permissions.
  3. The user grants the permissions.
  4. The authorization server redirects the user to the redirect URI, which delivers the authorization code to the application.
  5. The access token is requested by the application, along with the code_verifier.
  6. The access token is granted after the two hashes are compared.

This flow contains two additional elements:

  • code_verifier: A randomly generated string only known to the application.

  • Code challenge: A SHA-256 hash of the code_verifier (sent in base64url-encoded form), generated by the application.


When the application requests the authorization code from the authorization server, it computes and sends the code challenge, which is stored at the authorization server. The authorization server returns the authorization code to the application, just like the standard authorization code flow.

In its access token request, along with the authorization code, the application also sends its code_verifier in plaintext (which is safe because the connection is protected by TLS). The authorization server computes its own SHA-256 hash of the received code_verifier and compares it to the code challenge it has already stored. If they match, the application has proven that it's the one that started the flow, and an access token is granted to it.

Therefore, even if an attacker has the authorization code, it doesn't know the code_verifier, so it can't supply a value whose hash matches the stored code challenge. This successfully mitigates the breach.
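The generation and comparison described above can be sketched in a few lines. This sketch assumes the SHA-256 challenge method with base64url encoding and no padding, as defined for PKCE in RFC 7636; the function names are our own and not part of any library.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Application side: a random string known only to the application."""
    # 32 random bytes encode to 43 URL-safe characters.
    return secrets.token_urlsafe(32)


def make_code_challenge(code_verifier: str) -> str:
    """Application side: base64url-encoded SHA-256 hash of the verifier."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(code_verifier: str, stored_challenge: str) -> bool:
    """Server side: rehash the received verifier and compare to the stored challenge."""
    recomputed = make_code_challenge(code_verifier)
    return hmac.compare_digest(recomputed, stored_challenge)


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)  # sent with the authorization request

# The legitimate application later presents the original verifier.
print(server_accepts(verifier, challenge))

# An attacker holding only the authorization code has to guess a verifier.
print(server_accepts(make_code_verifier(), challenge))
```

Because the hash can't practically be reversed, seeing the code challenge in the first request doesn't help an attacker recover the code_verifier.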

Implicit grant flow#

Before the creation of the authorization code flow with PKCE, the implicit grant flow was used by public applications to obtain access tokens. Now, it's used only in environments where the application is completely trusted with the user's resources. In the current API landscape, the implicit grant flow is losing favor to the authorization code flow with PKCE.

The implicit grant flow functions similarly to the authorization code flow, but instead of getting an authorization code, we directly get the access token.

This flow isn't recommended due to the risks that accompany returning the access token this way. Since the implicit grant flow isn't the most secure flow, its tokens are short-lived, aren't refreshable, and have to be requested again when they expire.

  1. The application/client redirects the user to the authorization server.
  2. The authorization server asks the user to grant permissions.
  3. The user grants the permissions.
  4. The access token is granted instantly.
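To make the risk concrete, here is a minimal sketch of how an application might read the token in this flow. It assumes the access token comes back in the fragment of the redirect URI, as the OAuth 2.0 specification describes for the implicit grant; the URL and token values are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def parse_implicit_redirect(redirect_url: str) -> dict:
    """Extract the token fields from the fragment of the redirect URI."""
    fragment = urlparse(redirect_url).fragment
    fields = {key: values[0] for key, values in parse_qs(fragment).items()}
    if "access_token" not in fields:
        raise ValueError("no access token in redirect")
    return fields


# Hypothetical redirect; note that no refresh token is included.
redirect = (
    "https://app.example.com/callback"
    "#access_token=example-token&token_type=bearer&expires_in=3600"
)
token = parse_implicit_redirect(redirect)
print(token["access_token"], token["expires_in"])
```

Since the token itself appears in the URL handled by the browser, anything that can read that URL can read the token, which is why the flow relies on short lifespans.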

Client credentials flow#

In a situation where we don't need access to any of the user's resources, we can obtain the access token through the client credentials flow. This is used to access public endpoints in APIs, which don't require user permissions.

An example of this is Spotify displaying a public playlist that's available across the application, so no user permissions are needed to view the playlist. The application requests tokens from the server and receives them without any communication with the user. The process of the client credentials flow is shown below:

  1. The application requests the access token from the authorization server.
  2. The access token is granted.

With no special user permissions required, the access token is exchanged fairly quickly. Through this token, the application can access the desired endpoints.
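The single request in this flow can be sketched as follows. The endpoint and credentials are hypothetical, the request is only built and not sent, and passing the client's credentials through HTTP Basic authentication is an assumption that matches common provider behavior. The sketch also skips the URL-encoding of the credentials that some providers require.

```python
import base64
import urllib.request
from typing import Optional
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://auth.example.com/token"  # hypothetical endpoint


def build_client_credentials_request(
    client_id: str, client_secret: str, scope: Optional[str] = None
) -> urllib.request.Request:
    """Build a token request that involves no end user at all."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    return urllib.request.Request(
        TOKEN_ENDPOINT,
        data=urlencode(form).encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


cc_request = build_client_credentials_request(
    "example-client-id", "example-client-secret"
)
print(cc_request.get_method(), cc_request.full_url)
print(cc_request.data.decode("ascii"))
```

The response would contain an access token that the application attaches to its calls to the public endpoints.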

The situations or optimal use cases in which the four flows can be used are detailed in the table below:

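Use Cases of the Four Flows

| Use Case                                                      | Flow                                          |
|---------------------------------------------------------------|-----------------------------------------------|
| The client is a web app executing on a server                 | Authorization code/PKCE                       |
| Doesn't require user permissions                              | Client credentials                            |
| SPA/mobile app                                                | Authorization code with PKCE                  |
| The application is absolutely trusted with the user resources | Implicit grant flow (if PKCE is not possible) |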
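The use cases above can also be read as a small decision procedure. The sketch below is one possible interpretation of the table, not an official rule: the function and parameter names are our own, and it treats the implicit grant purely as a fallback for public clients when PKCE isn't available.

```python
from enum import Enum


class Flow(Enum):
    AUTHORIZATION_CODE = "authorization code"
    AUTHORIZATION_CODE_PKCE = "authorization code with PKCE"
    CLIENT_CREDENTIALS = "client credentials"
    IMPLICIT = "implicit grant"


def choose_flow(
    needs_user_permission: bool,
    is_public_client: bool,
    pkce_supported: bool = True,
) -> Flow:
    """Pick a flow from the use case, following the table above."""
    if not needs_user_permission:
        # No user resources involved, so the client's own credentials suffice.
        return Flow.CLIENT_CREDENTIALS
    if is_public_client:
        # SPAs and mobile apps can't keep a secret; fall back to implicit
        # only when PKCE is not possible.
        return Flow.AUTHORIZATION_CODE_PKCE if pkce_supported else Flow.IMPLICIT
    # Server-side web apps can use the authorization code flow, with PKCE
    # when the provider supports it.
    return Flow.AUTHORIZATION_CODE_PKCE if pkce_supported else Flow.AUTHORIZATION_CODE


print(choose_flow(needs_user_permission=False, is_public_client=False).value)
print(choose_flow(needs_user_permission=True, is_public_client=True).value)
```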

Conclusion#

The table below summarizes the differences between the various flows:

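Differences of the Four Flows

| Flows                        | User's Permission | Can Access or Modify User's Resources | Refreshable |
|------------------------------|-------------------|---------------------------------------|-------------|
| Authorization code           | Required          | Yes                                   | Yes         |
| Authorization code with PKCE | Required          | Yes                                   | Yes         |
| Client credentials           | Not required      | No                                    | No          |
| Implicit grant               | Required          | Yes                                   | No          |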

Throughout this entire process, OAuth 2.0 serves only as an authorization framework: nowhere does the access token specify who its owner is, so a malicious user who obtains the token may abuse it. If OAuth 2.0 is implemented alone, the question of authentication remains open, so we tend to pair it with a relevant framework to handle these use cases. The actual authentication of the user is usually done through OpenID Connect, which we explore in the upcoming lesson.
